98 research outputs found

    Analysis of a social webquest for statistics in engineering

    A webquest for students of an introductory course in Statistics in Industrial Design Engineering is presented and analyzed. To make students work on the most basic concepts (sampling and description of data) and on thinking skills, I have designed a webquest [1] in which they act as critical citizens. The webquest starts with a video from the outstanding series “Against All Odds: Inside Statistics” [2], Program 3, Describing Distributions, which shows how a local government used statistical methods to correct inequality between men’s and women’s salaries in the USA in the 1980s. Students investigate the current wage gap by carrying out and analyzing a small survey and by using the official data of the INE (Spanish National Statistics Institute); one of the variables they collect and examine is the number of hours devoted to work at home. This webquest is designed to educate students for equality. Students also research a brilliant woman who used statistics to save many lives: Florence Nightingale. The other activities in the webquest tackle current issues, such as climate change, through ecological footprints, and statistics in daily life, for example in clinical analysis, incorrect use of statistics in the media, or writing a letter to the city council to correct a high water bill based on incorrect water consumption estimates (this activity is based on a true story in which the citizen received a refund). The webquest also has research activities to show the importance of statistics in industrial design, and specifically in ergonomics. These activities involve the description of relationships: the regression line, and principal component analysis for the accommodation of pilots in aircraft design. I analyze the data collected from my students about their experience and opinion of the webquest.

    Intervention in prediction measure: a new approach to assessing variable importance for random forests

    Background: Random forests are a popular method in many fields since they can be successfully applied to complex data, with a small sample size, complex interactions and correlations, mixed type predictors, etc. Furthermore, they provide variable importance measures that aid qualitative interpretation and also the selection of relevant predictors. However, most of these measures rely on the choice of a performance measure, and measures of prediction performance are not unique or may even lack a clear definition, as in the case of multivariate response random forests. Methods: A new alternative importance measure, called Intervention in Prediction Measure, is investigated. It depends on the structure of the trees, without depending on performance measures. It is compared with other well-known variable importance measures in different contexts, such as a classification problem with variables of different types, another classification problem with correlated predictor variables, and problems with multivariate responses and predictors of different types. Results: Several simulation studies are carried out, showing the new measure to be very competitive. In addition, it is applied in two well-known bioinformatics applications previously used in other papers. Improvements in performance are also provided for these applications by the use of this new measure. Conclusions: This new measure is expressed as a percentage, which makes it attractive in terms of interpretability. It can be used with new observations. It can be defined globally, for each class (in a classification problem) and case-wise. It can easily be computed for any kind of response, including multivariate responses. Furthermore, it can be used with any algorithm employed to grow each individual tree. It can be used in place of (or in addition to) other variable importance measures. This work has been partially supported by Grant DPI2013-47279-C2-1-R from the Spanish Ministerio de Economía y Competitividad. The funders played no role in the design or conclusions of this study.

    Glossari: Dubtes preguntats altres anys

    IG23: Ampliació d'Estadística. All the questions asked in previous years, organized by topic (in Spanish and Valencian).

    Mapping the Asymmetrical Citation Relationships Between Journals by h-Plots

    I propose the use of h-plots for visualizing the asymmetric relationships between the citing and cited profiles of journals in a common map. With this exploratory tool, we can better understand a journal's dual roles of citing and being cited in a reference network. The h-plot is introduced and its use is validated with a set of 25 journals belonging to the statistics area. The relatedness factor is considered for describing the relations of citations from a journal “i” to a journal “j,” and the citations from the journal “j” to the journal “i.” The h-plot extracted more information than other statistical techniques for modelling and representing asymmetric data, such as multidimensional unfolding.

    Functional archetype and archetypoid analysis

    Archetype and archetypoid analysis can be extended to functional data. Each function is approximated by a convex combination of actual observations (functional archetypoids) or functional archetypes, which are a convex combination of observations in the data set. Well-known Canadian temperature data are used to illustrate the analysis developed. Computational methods are proposed for performing these analyses, based on the coefficients of a basis. Unlike a previous attempt to compute functional archetypes, which was only valid for an orthogonal basis, the proposed methodology can be used for any basis. It is computationally less demanding than the simple approach of discretizing the functions. Multivariate functional archetype and archetypoid analysis are also introduced and applied in an interesting problem about the study of human development around the world over the last 50 years. These tools can contribute to the understanding of a functional data set, as in the classical multivariate case. This work has been partially supported by Grant DPI2013-47279-C2-1-R.
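The convex-combination step described in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the archetypes are already known and only finds the convex weights for one function, working on basis coefficients. The names `project_to_simplex`, `convex_weights`, `Z` (archetype coefficients), `W` (Gram matrix of a possibly non-orthogonal basis) and the toy numbers are all hypothetical, and projected gradient descent is a swapped-in generic solver.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]                      # sort in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def convex_weights(x, Z, W, n_iter=2000):
    """Minimize 0.5 * (alpha Z - x) W (alpha Z - x)^T over the simplex.

    x: basis coefficients of the function to approximate, shape (m,)
    Z: basis coefficients of the archetypes, shape (k, m)
    W: symmetric positive definite Gram matrix of the basis, shape (m, m)
    """
    H = Z @ W @ Z.T
    step = 1.0 / np.linalg.eigvalsh(H)[-1]    # 1 / largest eigenvalue
    alpha = np.full(Z.shape[0], 1.0 / Z.shape[0])
    for _ in range(n_iter):
        grad = (alpha @ Z - x) @ W @ Z.T
        alpha = project_to_simplex(alpha - step * grad)
    return alpha

# Toy data (assumed): three archetypes in a three-function, non-orthogonal basis.
Z = np.eye(3)
W = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])
x = 0.2 * Z[0] + 0.5 * Z[1] + 0.3 * Z[2]
alpha = convex_weights(x, Z, W)
```

The Gram matrix `W` is what lets the squared L2 distance between functions be computed from coefficients when the basis is not orthogonal; with an orthonormal basis it reduces to the identity.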

    Shape Descriptors for classification of functional data

    Curve discrimination is an important task in engineering and other sciences. We propose several shape descriptors for classifying functional data, inspired by form analysis from the image analysis field: statistical moments, coefficients of the components of independent component analysis (ICA) and two mathematical morphology descriptors (morphological covariance and spatial size distributions). They are applied to three problems: an artificial problem, a speech recognition problem and a biomechanical application. Shape descriptors are compared with other methods in the literature, obtaining better or similar performance.

    Morphological analysis of cells by means of an elastic metric in the shape space

    Shape analysis is of great importance in many fields, such as computer vision, medical imaging, and computational biology. This analysis can be performed considering shapes as closed planar curves in the shape space. This approach has been used for the first time to obtain the morphological classification of erythrocytes in digital images of sickle cell disease considering the shape space S1, which has the property of being isometric to an infinite-dimensional Grassmann manifold of two-dimensional subspaces (Younes et al., 2008), without taking advantage of all the features offered by the elastic metric related to the possibility of stretching and bending of the curves. In this paper, we study this deformation in the shape space, S2, which is based on the representation of closed planar curves by means of the square-root velocity function (SRVF) (Srivastava et al., 2011), using the elastic metric of this space to obtain more efficient geodesics and geodesic lengths between planar curves. Supervised classification with this approach achieved an accuracy of 94.3%, classification using templates achieved 94.2% and unsupervised clustering in three groups achieved 94.7%, considering three classes of erythrocytes: normal, sickle, and with other deformations. These results are better than those previously achieved in the morphological analysis of erythrocytes, and the method can be used in different applications related to the treatment of sickle cell disease, even in cases where it is necessary to study the process of evolution of the deformation, something that cannot be done in a natural way in the feature space.
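The SRVF representation mentioned in this abstract can be sketched numerically. For a curve f, the SRVF is q(t) = f'(t) / sqrt(|f'(t)|), so that the squared L2 norm of q equals the length of the curve. The sketch below only computes this representation for a sampled curve; it does not implement the elastic geodesics or the classification from the paper, and the function names, the finite-difference derivative and the circle example are assumptions made for illustration.

```python
import numpy as np

def srvf(curve, t):
    """Square-root velocity function of a sampled planar curve.

    curve: array of shape (n, 2), points f(t_i)
    t:     array of shape (n,), parameter values
    """
    velocity = np.gradient(curve, t, axis=0)       # finite-difference f'(t)
    speed = np.linalg.norm(velocity, axis=1)
    # Where the speed is (numerically) zero the velocity is too, so q is ~0 there.
    safe_speed = np.where(speed > 1e-12, speed, 1.0)
    return velocity / np.sqrt(safe_speed)[:, None]

def trapezoid(y, t):
    """Plain trapezoidal rule, written out to avoid version-specific NumPy names."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def length_from_srvf(q, t):
    """Curve length recovered as the squared L2 norm of the SRVF."""
    return trapezoid(np.sum(q ** 2, axis=1), t)

# Example (assumed): a unit circle, whose length should be close to 2*pi.
t = np.linspace(0.0, 2.0 * np.pi, 2001)
circle = np.column_stack([np.cos(t), np.sin(t)])
q = srvf(circle, t)
length = length_from_srvf(q, t)
```

Rescaling q to unit L2 norm corresponds to rescaling the curve to unit length, which is one common preprocessing step before comparing shapes.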

    Ampliación de estadística para la Ingeniería Técnica en Informática de Gestión

    Enginyeria Tècnica en Informàtica de Gestió (Technical Engineering in Management Informatics, 2001 curriculum). IG23: Ampliació d'Estadística

    Estadística bàsica per a l'Enginyeria Tècnica en Informàtica de Gestió

    Enginyeria Tècnica en Informàtica de Gestió (Technical Engineering in Management Informatics, 2001 curriculum). IG12: Estadística